Select the R package you would like to use for the analysis from the “Select R package” pull-down list.
To upload a dataset, use the ‘Browse’ button in the “Choose file to upload” field.
Hint: the dataset must be fully processed and contain the initial clustering information!
Hint 2: for RaceID, the dataset must be preprocessed with version 3 of the package; for Monocle, with version 2.
This vignette showcases the use of a dataset from a custom path.
A published dataset stored under “/data/processing/scRNAseq_shiny_app_example_data/GSE81076_raceid.workspaceR/sc.minT1000.RData” will be analyzed. See Grün D, Muraro MJ, Boisset JC, Wiebrands K et al. De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data. Cell Stem Cell 2016 Aug 4;19(2):266-277 for the original publication.
Select RaceID3 as the analysis R package. Upload the dataset (wait until the upload is complete) and click on ‘Select dataset’ (Figure 1).
Figure 1: R package selection and dataset upload
After a short delay, the head of the normalized data appears in the “Input Data” tab (Figure 2). You can also check the dimensions of your matrix and the summary of the TPC (transcripts per cell) distribution in the corresponding boxes.
Figure 2: Head of normalized counts and data summary
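The dimensions and the TPC summary can also be recomputed outside the app as a sanity check. The following is a minimal Python sketch, not the app's own R code: it assumes a genes-by-cells matrix held as a list of rows, and the helper name `tpc_summary` and the toy values are hypothetical.

```python
import statistics

def tpc_summary(counts):
    """Return matrix dimensions and a summary of transcripts per cell (TPC).

    `counts` is a genes x cells matrix given as a list of rows (one per gene).
    """
    n_genes = len(counts)
    n_cells = len(counts[0]) if counts else 0
    # The TPC of a cell is the sum of its column over all genes.
    tpc = [sum(row[j] for row in counts) for j in range(n_cells)]
    return {
        "dim": (n_genes, n_cells),
        "min": min(tpc),
        "median": statistics.median(tpc),
        "mean": statistics.mean(tpc),
        "max": max(tpc),
    }

# Toy matrix (hypothetical values): 3 genes x 4 cells.
toy_counts = [
    [5, 0, 2, 1],
    [3, 4, 0, 1],
    [2, 6, 8, 0],
]
summary = tpc_summary(toy_counts)
print(summary)
```

A cell with a TPC far below the rest, like the last column of the toy matrix, is the kind of outlier the summary box helps to spot.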
In the “tSNE map and clustering” tab, a plot of the within-cluster dispersion as a function of cluster number will appear in the “Metrics for cluster number selection” box (Figure 3). A silhouette plot illustrating cluster assignment quality, as well as a cluster membership tSNE plot for the preselected number of clusters, are displayed for the loaded dataset. Use this information to guide your choice of cluster number, as described in the package vignette.
The dataset was originally clustered into 6 clusters.
Figure 3: Cluster quality metrics for loaded dataset
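To help read the silhouette plot, the sketch below computes silhouette widths by hand. It is an illustration in Python under stated assumptions, not the code the app runs: Euclidean distance is assumed (the analysis package may use a different distance), and the function name `silhouette_widths` and the toy points are hypothetical.

```python
import math

def silhouette_widths(points, labels):
    """Silhouette width of each point, using Euclidean distance (assumption)."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette widths need at least two clusters")
    widths = []
    for i, p in enumerate(points):
        # Mean distance from point i to the other members of each cluster.
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points)
                       if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(math.dist(p, q) for q in members) / len(members)
        own = labels[i]
        if own not in mean_dist:
            # A point alone in its cluster gets width 0 by convention.
            widths.append(0.0)
            continue
        a = mean_dist[own]                                      # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)    # separation
        scale = max(a, b)
        widths.append((b - a) / scale if scale > 0 else 0.0)
    return widths

# Toy data: two well-separated pairs of points.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_widths(toy_points, [1, 1, 2, 2])   # sensible assignment
bad = silhouette_widths(toy_points, [1, 2, 1, 2])    # scrambled assignment
print(sum(good) / len(good), sum(bad) / len(bad))
```

Widths close to 1 indicate cells that sit well inside their cluster; widths near 0 or below suggest that the chosen cluster number splits or merges groups poorly.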
Suppose you decide to change the number of clusters, e.g. to 3. Update the value on the slider and click on ‘Update cluster plots’. This initiates re-clustering (Figure 4), and after some waiting time, the updated tSNE and silhouette plots replace the old plots (Figure 5).
Figure 4: Update cluster number choice
Figure 5: Plots for updated cluster number
To obtain the top markers for each cluster (2 per cluster by default), click on ‘Get marker genes’ in the bottom half of the page (Figure 6). After a rather long wait, a table of top markers and the corresponding heatmap appear (Figure 7).
Figure 6: Request top marker genes
Figure 7: Top marker genes result
To increase the number of markers displayed in the table and on the heatmap, move the slider above the table. Both outputs will be updated (Figure 8).
Figure 8: Update the number of marker genes displayed
To download the marker table, use the ‘Download table’ button (Figure 9).
Figure 9: Download cluster marker table
In the “Marker Gene Visualization” tab, you may plot expression of selected genes, as long as they are expressed in at least 1 cell in the dataset. To select a gene, copy one of the top markers into the “GeneID” field in the box and click on ‘Select genes’ (Figure 10).
Figure 10: Select gene IDs for visualization
Check in the ‘Genes used’ field that the selected gene(s) are expressed in the dataset (Figure 10).
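The rule that a gene must be expressed in at least one cell can be pictured as a simple filter. The Python sketch below is only an illustration of that rule, not the app's implementation: it assumes that a value of zero means "not expressed", and the function name `split_by_expression` and the gene IDs are hypothetical.

```python
def split_by_expression(expr, gene_ids):
    """Split requested gene IDs into usable and rejected ones.

    `expr` maps a gene ID to its list of per-cell expression values.
    A gene is usable if it is present and non-zero in at least one cell.
    """
    used, rejected = [], []
    for gene in gene_ids:
        values = expr.get(gene)
        if values is not None and any(v > 0 for v in values):
            used.append(gene)
        else:
            rejected.append(gene)
    return used, rejected

# Toy expression table with hypothetical gene IDs.
toy_expr = {
    "GeneA": [0, 3, 1],
    "GeneB": [0, 0, 0],   # present but never expressed
}
used, rejected = split_by_expression(toy_expr, ["GeneA", "GeneB", "GeneX"])
print(used, rejected)
```

Only the genes in the first list would be expected to show up under ‘Genes used’; a gene that is silent everywhere, or absent from the dataset, is dropped.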
Modify the plot title and expression scale if needed, and click on ‘Plot tsne map’ to visualise expression of the selected gene(s) (Figure 11).
Figure 11: Tsne map with marker gene expression
In the “Correlation Analyses” tab, you may query your dataset for the genes most correlated with your genes of interest and obtain pairwise gene expression plots. Again, enter a gene ID in the side box and click on the “Select genes” button in this tab (Figure 12).
Figure 12: Select gene IDs for correlation analysis
A violin plot of the Pearson correlation calculated for log2-transformed counts will appear, alongside a list of the top 10 genes with the highest absolute correlation to the selected genes (Figure 13).
Figure 13: Display top correlated genes
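The ranking behind this list can be sketched as follows. This is a hedged Python illustration of "Pearson correlation on log2-transformed counts, ranked by absolute value", not the app's R code: the pseudocount added before taking the logarithm is an assumption (the source does not say how zeros are handled), and the names `pearson`, `top_correlated` and the gene IDs are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long sequences (NaN if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def top_correlated(expr, query, k=10, pseudocount=1.0):
    """Return the k genes with the highest absolute correlation to `query`."""
    # Assumption: a pseudocount keeps log2 defined for zero counts.
    logged = {g: [math.log2(v + pseudocount) for v in vals]
              for g, vals in expr.items()}
    scores = []
    for gene, vals in logged.items():
        if gene == query:
            continue
        r = pearson(logged[query], vals)
        if not math.isnan(r):          # skip genes with constant expression
            scores.append((gene, r))
    scores.sort(key=lambda item: abs(item[1]), reverse=True)
    return scores[:k]

# Toy counts for four cells (hypothetical gene IDs and values).
toy_expr = {
    "GeneA": [0, 1, 3, 7],
    "GeneB": [0, 1, 3, 7],   # identical to GeneA
    "GeneC": [7, 3, 1, 0],   # mirror image of GeneA
    "GeneD": [1, 0, 1, 0],
    "GeneE": [2, 2, 2, 2],   # constant, correlation undefined
}
top = top_correlated(toy_expr, "GeneA", k=10)
print(top)
```

Ranking by absolute value means that strongly anti-correlated genes, such as the mirrored toy gene, appear alongside positively correlated ones.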
To plot pairwise correlation for selected genes, enter gene IDs into the boxes collecting information for X and Y axes in the bottom half of the page, adjust the plot title if necessary, and click on the “Plot expression” button (Figure 14).
Figure 14: Select gene IDs for pairwise expression plot
A pairwise plot of normalized counts will appear (Figure 15).
Figure 15: Pairwise expression plot
A published dataset stored under “/data/processing/scRNAseq_shiny_app_example_data/GSE81076_monocle.workspaceR/minT5000.mono.set.RData” will be analyzed. See Grün D, Muraro MJ, Boisset JC, Wiebrands K et al. De Novo Prediction of Stem Cell Identity using Single-Cell Transcriptome Data. Cell Stem Cell 2016 Aug 4;19(2):266-277 for the original publication.
Select Monocle as the analysis package. Upload the dataset (wait until the upload is complete) and click on ‘Select dataset’ (Figure 16).
Figure 16: R package selection and dataset upload
After a short delay, the head of the normalized data appears in the “Input Data” tab (Figure 17). You can also check the dimensions of your matrix and the summary of the TPC (transcripts per cell) distribution in the corresponding boxes.
Figure 17: Head of normalized counts and dataset summary
In the “tSNE map and clustering” tab, a plot of delta (distance) versus rho (density) will appear in the “Metrics for cluster number selection” box (Figure 18). A silhouette plot illustrating cluster assignment quality, as well as a cluster membership tSNE plot for the preselected number of clusters, are displayed for the loaded dataset. Use this information to guide your choice of cluster number, as described in the package vignette.
The dataset was originally clustered into 17 clusters (Figure 18).
Figure 18: Cluster quality metrics for loaded dataset
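The delta-versus-rho plot is easier to interpret with the two quantities spelled out. The Python sketch below assumes they follow the usual density-peak definitions (rho: number of neighbours within a cutoff distance; delta: distance to the nearest point of higher density). That is an assumption about the plot, not a statement of what the app computes, and the function name `rho_delta`, the cutoff value and the toy points are hypothetical.

```python
import math

def rho_delta(points, cutoff):
    """Local density (rho) and distance to nearest denser point (delta)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: how many other points lie closer than the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser neighbour gets its largest distance instead.
        delta.append(min(denser) if denser else max(dist[i]))
    return rho, delta

# Toy data: a group of five points and a group of three points.
toy_points = [
    (0, 0), (0, 1), (1, 0), (0, -1), (-1, 0),   # first group, centre first
    (10, 0), (10, 1), (10, -1),                 # second group, centre first
]
rho, delta = rho_delta(toy_points, cutoff=1.2)
centres = [i for i in range(len(toy_points)) if delta[i] > 5]
print(rho, delta, centres)
```

In this toy example only the two group centres have a large delta, so they stand apart on a delta-versus-rho plot; counting such points is one way to judge a plausible cluster number.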
Suppose you decide to change the number of clusters, e.g. to 3. Update the value on the slider and click on ‘Update cluster plots’. This initiates re-clustering (Figure 19), and after some waiting time, the updated tSNE and silhouette plots replace the old plots (Figure 20).
Figure 19: Update cluster number choice
Figure 20: Plots for updated cluster number
To obtain the top markers for each cluster (2 per cluster by default), click on ‘Get marker genes’ in the bottom half of the page (Figure 21). After a rather long wait, a table of top markers and the corresponding heatmap appear (Figure 22). This calculation is done using the scRNAseq analysis R package ‘Seurat’.
Figure 21: Request top marker genes
Figure 22: Top marker genes result
To increase the number of markers displayed in the table and on the heatmap, move the slider above the table. Both outputs will be updated (Figure 23).
Figure 23: Update the number of marker genes displayed
To download the marker table, use the ‘Download table’ button (Figure 24).
Figure 24: Download cluster marker table
In the “Marker Gene Visualization” tab, you may plot expression of selected genes, as long as they are expressed in at least 1 cell in the dataset. To select a gene, copy one of the top markers into the “GeneID” field in the box and click on ‘Select genes’ (Figure 25).
Figure 25: Select gene IDs for visualization
Check in the ‘Genes used’ field that the selected gene(s) are expressed in the dataset (Figure 25).
Modify the plot title and expression scale if needed, and click on ‘Plot tsne map’ to visualise expression of the selected gene(s) (Figure 26).
Figure 26: Tsne map with marker gene expression
In the “Correlation Analyses” tab, you may query your dataset for the genes most correlated with your genes of interest and obtain pairwise gene expression plots. Again, enter a gene ID in the side box and click on the “Select genes” button in this tab (Figure 27).
Figure 27: Select gene IDs for correlation analysis
A violin plot of the Pearson correlation calculated for log2-transformed counts will appear, alongside a list of the top 10 genes with the highest absolute correlation to the selected genes (Figure 28).
Figure 28: Display top correlated genes
To plot pairwise correlation for selected genes, enter gene IDs into the boxes collecting information for X and Y axes in the bottom half of the page, adjust the plot title if necessary, and click on the “Plot expression” button (Figure 29).
Figure 29: Select gene IDs for pairwise expression plot
A pairwise plot of normalized counts will appear (Figure 30).
Figure 30: Pairwise expression plot
To keep track of the parameters you used to generate your plots, it is recommended that you encode them either in the plot titles (customizable by the user) or in the file names under which you save your plots.
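One way to follow this recommendation is to build file names from the parameters themselves. The Python sketch below shows one possible naming scheme; the function name `plot_filename`, the parameter keys and the gene ID are hypothetical and not part of the app.

```python
import re

def plot_filename(prefix, params, ext="png"):
    """Build a file name that records the parameters behind a plot."""
    parts = [prefix]
    for key in sorted(params):
        # Replace characters that are awkward in file names with a dash.
        value = re.sub(r"[^A-Za-z0-9.-]+", "-", str(params[key]))
        parts.append(f"{key}-{value}")
    return "_".join(parts) + "." + ext

name = plot_filename(
    "tsne",
    {"package": "RaceID3", "clusters": 3, "gene": "GeneA"},
)
print(name)  # tsne_clusters-3_gene-GeneA_package-RaceID3.png
```

Sorting the keys keeps names comparable across plots, so two files that differ in a single parameter differ in a single, predictable place.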
To keep track of the R and R package versions, you might want to inspect the ‘sessionInfo’ tab. This contains the output of the sessionInfo() R command (Figure 31). At the bottom of the page, two buttons are available (Figure 32). Click on ‘Download session info’ or ‘Download your data’ to save the respective file on your computer.
Figure 31: Session Info tab
Figure 32: Download documentation and modified dataset
Lastly, the code behind the app can be retrieved from “https://github.com/maxplanck-ie/scRNAseq_shiny_app” for the given version of the app. The app version is shown at the bottom of the side bar (Figure 33).
Figure 33: App version